Jonatan Pallesen
I am a data scientist with a PhD in genomics. I am curious by nature and enjoy the challenge of separating signal from noise to gain real insights.
Throughout my education and work I have acquired extensive theoretical and practical experience with data handling, statistics, machine learning, visualization and algorithms. I am a very skilled programmer in Python, R and Julia, with more than ten years of experience in Python and R.
As a statistical analyst and data scientist I have worked on a wide variety of projects, for which I have been in charge of all stages of the analysis: problem definition, data cleaning and quality control, statistical analysis, model building and presentation.
Work Experience
Data scientist
Raven biosciences
2019
- Lead data scientist
- Working on a variety of projects in education, fintech and automated machine learning.
Statistical analyst
Aarhus University
2015 - 2018
- Programming pipelines and tools, working with very large data sets, statistical analysis and machine learning.
Education
PhD, human genetics
Aarhus University
2011 - 2015
- Thesis: Association studies of psychiatric disorders: On association of genes, gene sets and runs of homozygosity.
Visiting researcher
University of California, Berkeley
2012
MSc, molecular biology and computer science
Aarhus University
2004 - 2011
Selected public data science projects
I regularly publish new analyses and visualizations on my blog.
How to transform your data
2019
- Using simulations to determine the optimal transformation for skewed variables.
Blind auditions and gender discrimination
2019
- Re-analysis of a seminal study (used for an article in the Wall Street Journal).
Selected Publications
Discovery of the first genome-wide significant risk loci for attention deficit hyperactivity disorder
Nature Genetics. (link)
2019
- Demontis et al.
Identification of common genetic risk variants for autism spectrum disorder
Nature Genetics. (link)
2019
- Grove et al.
LandScape: a simple method to aggregate p-values and other stochastic variables without a priori grouping
Statistical Applications in Genetics and Molecular Biology. (link)
2016
- Joint first author with Carsten Wiuf.